Serial No.: 09/698,341

P410 of SEQ ID NO: 2; and an alanine to threonine mutation at a site corresponding to A485 of 
SEQ ID NO: 2.



14. The DNA polymerase of claim 6 that has reduced discrimination against a non-conventional 
nucleotide selected from the group consisting of: dideoxynucleotides, ribonucleotides and 
conjugated nucleotides. 




Please enter new claims 85-88 as follows: 




85. The DNA polymerase of claim 10 wherein said mutation in Region II is selected from the 
group consisting of: a leucine to histidine mutation at a site corresponding to L408 of SEQ ID 
NO: 2; a leucine to phenylalanine mutation at a site corresponding to L408 of SEQ ID NO: 2; a 
proline to leucine mutation at a site corresponding to P410 of SEQ ID NO: 2.

86. The DNA polymerase of claim 10, said polymerase further comprising an alanine to 
threonine mutation at a site corresponding to A485 of SEQ ID NO: 2.

87. The DNA polymerase of claim 10 that has reduced discrimination against a 
non-conventional nucleotide selected from the group consisting of: dideoxynucleotides, 
ribonucleotides and conjugated nucleosides.



88. An isolated recombinant JDF-3 DNA polymerase that comprises an alanine to threonine 
mutation at a site corresponding to A485 of SEQ ID NO: 2.



REMARKS 

As a result of this amendment, claims 1-3 and 5-47 and 85-87 are pending. Claims 4 and 
11 and non-elected claims 48-84 are canceled without prejudice. Claims 1-3, 5, 46 and 47 are 
allowed. New claims 85-88 are added. New claims 85-87 are added to re-capture material 
removed from claims 12-14 by amendment to parent claim 10, and therefore add no new matter. 
New claim 88 is supported throughout the specification and therefore adds no new matter. 
Objection to the Specification

The specification is objected to because page 23, line 5 recites "the conventional 
deoxynucleotides dATP, dCTP, dGTP and TTP." The Office Action states that it is believed that 
applicants intended that "TTP" be "dTTP," and requests appropriate correction. Applicants 
submit that TTP and dTTP are alternative ways of referring to the same molecule. The 
Examiner's attention is directed to the accompanying copy of page 957 of the 2001 
Sigma-Aldrich chemical catalog (Exhibit A), which shows that thymidine-5'-triphosphate (Catalog No. 
T 0251) is correctly referred to as both dTTP and TTP. Applicants therefore request the 
withdrawal of the objection to the specification.

Rejection under 35 U.S.C. §112, first paragraph

Claims 6-15 are rejected under 35 U.S.C. §112, first paragraph for lack of written 
description. The Office Action states that Thermococcus JDF-3 is essential to the claimed 
invention and that the claimed organisms are not fully disclosed nor have they been shown to be 
publicly available. Applicants respectfully disagree. 

Applicants submit that Thermococcus strain JDF-3 is not essential to the claimed 
invention. Applicants have provided polynucleotide and amino acid sequences for wild-type 
JDF-3 polymerase and specified sites and substitutions for a number of mutants satisfying the 
limitations of claims 6-15. This is all that is required to satisfy the written description 
requirement. Applicants submit that it is not necessary to have access to Thermococcus strain 
JDF-3 in order to practice the full scope of these claims, and therefore respectfully request the 
withdrawal of this rejection under §112, first paragraph. 

Claims 6-45 are rejected under 35 U.S.C. §112, first paragraph for lack of written 
description on separate grounds from the deposit issue discussed above. The Office Action 
states that the specification only provides representative species encompassed by the claims 
wherein the mutant polymerase is Thermococcus JDF-3 and the mutation is selected from the 
specified residues of SEQ ID NO: 2. The Office Action states that the mutations described "are 
not representative of the genus of mutations claimed which encompasses any and all mutations 
of any Family B or Thermococcus species JDF-3 polymerase which results in a decrease in 3' to 
5' exonuclease activity or a reduction in discrimination against non-conventional nucleotides." 
The Office Action also states that there is no disclosure of any particular structure to 
function/activity relationship in the claimed genuses or structural characteristic other than a 
decrease in 3' to 5' exonuclease activity or a reduction in discrimination against non- 
conventional nucleotides for which no predictability of structure is apparent, concluding that the 
specification fails to make clear that applicants were in possession of the claimed invention at the 
time of filing. Applicants respectfully disagree.

The Written Description Guidelines state that the scope of the claims determines the level 
of detail of the written description which must be provided to support them. The specification as 
originally filed must convey clearly to those skilled in the art as of its filing date, that the 
applicant has invented the subject matter later claimed. In order to meet the written description 
requirement, there must be correspondence between the language of the claims and the 
specification. The correspondence does not have to be literal, but must be commensurate in 
scope. For example, if a claim covers a genus of biological materials, then the specification must 
describe a sufficient number of the species within the genus to convey with reasonable clarity to 
those skilled in the art that the inventor was in possession of the entire genus at the time the 
patent application was filed, or must provide guidance to permit those skilled in the art to 
understand what is covered by the claims. 

Under the written description requirement of §112, first paragraph, the application must 
be reviewed in its entirety to understand what the applicant has described as the essential features 
of the invention (see, e.g., Wang Labs. v. Toshiba Corp., 993 F.2d 858, 865, 26 U.S.P.Q.2d 
1767, 1774 (Fed. Cir. 1993)). For genus claims, the written description requirement may be 
satisfied through a sufficient description of a representative number of species by (a) actual 
reduction to practice, (b) reduction to drawings, or (c) disclosure of relevant identifying 
characteristics, e.g., structure or other physical/chemical properties, by functional characteristics 
coupled with a known or disclosed correlation between function and structure, or by a 
combination of these sufficient to show possession of the claimed genus. The Written 
Description Guidelines define a "representative number of species" as sufficient species to be 
representative of the entire genus. A "representative number" depends on whether one skilled in 
the art would recognize that the applicant was in possession of the necessary common attributes 
or features of the elements possessed by the genus in view of the species disclosed. 



Applicants submit that the specification as a whole describes the claimed invention in 
terms that convey to one skilled in the art that Applicants were in possession of the claimed 
genus at the time the patent application was filed. First, Applicants submit that, contrary to the 
characterization in the Office Action, the specification describes species beyond simply 
Thermococcus JDF-3 polymerase mutated at specified residues of SEQ ID NO: 2. The 
specification provides a listing of 55 Family B polymerases and literature references describing 
them (Table I), as well as accession numbers for sequence information for 17 different Family B 
polymerases. Applicants submit that it was known in the art that multiple sequence alignment is 
a means of evaluating which domains of a protein are likely to have functional significance, and 
that such alignment had been performed for a number of Family B DNA polymerases (see, e.g., 
Braithwaite and Ito, 1993, Nucl. Acids Res. 21: 787-802, and Wong et al., 1988, EMBO J. 7: 37- 
47, cited in the specification and incorporated therein by reference; EXHIBITS B and C). 
Applicants submit that alignments and knowledge of conserved regions permit one skilled in the 
art to identify amino acids or regions in a given Family B protein that correspond to amino acids 
or regions identified as critical functional determinants in Thermococcus JDF-3 DNA 
polymerase. For example, the specification teaches that Family B polymerases have six 
conserved structural Regions, numbered I through VI (page 7, lines 20-21), and that Region II 
has similar structural attributes to the nucleotide binding region of Family A polymerases, 
including the critical positioning of a tyrosine residue (page 7, line 21 to page 8, line 4). Thus, 
functional information on particular regions of Thermococcus JDF-3 Family B DNA polymerase 
provides a correlation between structure and anticipated function of the corresponding region in 
each of the Family B polymerases. The specification also provides detailed methodologies to 
assess the impact of a given mutation on nucleotide discrimination and 3' to 5' exo activity. 
Applicants describe and reduce to practice mutants of JDF-3 polymerase in the subject 
specification (see page 51, line 9 to page 54, line 15 and Examples IC, IP, IQ and Tables V and 
VI), thus establishing a structure/function correlation between sites to mutate and the resulting 
effect of those mutations on the function of the polymerase. This structure/function correlation 
permits the identification of regions to mutate and types of mutations to make in other Family B 
polymerases in order to achieve the claimed invention using other Family B polymerases. 
Applicants submit that the specification thus provides a number of species that is sufficient to 
convey to one skilled in the art that Applicants were in possession of the full scope of the 
invention at the time of filing. The methodologies for assessing function of a given Family B 
DNA polymerase mutant also permit the skilled artisan to determine whether a given mutant 
fulfills the limits of the claims. 
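
The notion of a residue "corresponding to" a position of a reference sequence can be illustrated with a minimal sketch, assuming a gapped pairwise alignment is already available. The function name, the toy sequences and the positions used below are hypothetical illustrations only; they are not taken from the specification or from any actual polymerase sequence.

```python
def corresponding_position(aligned_ref, aligned_other, ref_pos):
    """Map a 1-based residue position in a reference sequence to the
    residue aligned with it in a second sequence.

    Both inputs are rows of the same gapped alignment ('-' marks a gap).
    Returns (position, residue) in the second sequence, or None when the
    reference residue is aligned to a gap.
    """
    if len(aligned_ref) != len(aligned_other):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0    # ungapped residues seen so far in the reference
    other_count = 0  # ungapped residues seen so far in the other sequence
    for ref_res, other_res in zip(aligned_ref, aligned_other):
        if ref_res != "-":
            ref_count += 1
        if other_res != "-":
            other_count += 1
        if ref_res != "-" and ref_count == ref_pos:
            if other_res == "-":
                return None
            return (other_count, other_res)
    raise ValueError("ref_pos lies beyond the end of the reference sequence")


# Toy alignment (hypothetical sequences, for illustration only).
TOY_REF = "MK-LYPSD"
TOY_OTHER = "MKALY-SD"

if __name__ == "__main__":
    for pos in (3, 5, 7):
        print(pos, corresponding_position(TOY_REF, TOY_OTHER, pos))
```

In this sketch a reference residue that falls opposite a gap has no corresponding residue, which is why the function returns None rather than guessing a neighbor.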

As another example of the structure/function correlations provided by the specification, 
the specification also describes the relationship of Family B polymerase Region III (or Motif B) 
to nucleotide recognition, including description of the Region III consensus sequence structure 
KX3NSXYG and its functional relationship to the KX3(F/Y)GX2YG motif in helix O of the 
Family A polymerases. The specification points out that the O helix of Family A polymerases 
plays a role in ddNTP discrimination in Family A polymerases (page 9, line 20 to page 10, line 
9). Further, the specification describes the effects of site-directed mutagenesis of Region III of 
Vent™ polymerase and Thermococcus barosii Family B polymerase (page 10, lines 10-23). 
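
As a minimal sketch of how consensus motifs of this kind can be located in a protein sequence, the two motifs named above can be written as regular expressions, reading "X3" as any three residues and "(F/Y)" as phenylalanine or tyrosine. The pattern names, the helper function and the short test sequences are assumptions made for illustration and are not taken from the specification.

```python
import re

# Consensus motifs written as regular expressions (one-letter amino acid code).
REGION_III = r"K[A-Z]{3}NS[A-Z]YG"        # KX3NSXYG
HELIX_O = r"K[A-Z]{3}[FY]G[A-Z]{2}YG"     # KX3(F/Y)GX2YG


def find_motif(sequence, pattern):
    """Return a list of (1-based start position, matched text) for each
    non-overlapping occurrence of the pattern in the sequence."""
    return [(m.start() + 1, m.group(0))
            for m in re.finditer(pattern, sequence.upper())]


if __name__ == "__main__":
    # Short illustrative fragments, not full polymerase sequences.
    print(find_motif("MAKILLNSFYGYT", REGION_III))
    print(find_motif("GRRAAKTINFGVLYGMS", HELIX_O))
```

A pattern search of this kind only flags candidate motif positions; whether a given match lies in the corresponding structural region would still need to be confirmed against an alignment.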

Applicants submit that in each instance of describing the function of various domains or 
residues in non-JDF-3 polymerases, the specification provides a correlation between the specific 
residues or regions of the non-JDF-3 Family B polymerases and the corresponding residues or 
regions in the JDF-3 Family B DNA polymerase. For example, in describing mutagenesis 
studies on Vent™ polymerase, the specification states that the studies "targeted an alanine 
analogous to A485 of the Thermococcus species JDF-3 DNA polymerase" (page 10, lines 10- 
11). As another example, this time referring to mutagenesis studies of Region II of Vent™ 
polymerase, the specification states "site directed mutagenesis of VENT™ DNA polymerase 
demonstrated that three mutations at Y412 (which corresponds to JDF-3 DNA polymerase Y409) 
could alter nucleotide binding" (page 9, lines 15-17; emphasis added). Thus, Applicants submit 
that the specification provides structure/function relationships for regions of Family B DNA 
polymerases that are reasonably applicable to all Family B polymerases. 

Applicants submit that the structure/function relationships described in the specification 
between regions of other polymerases (both Family B and non-Family B) and regions of the 
JDF-3 Family B DNA polymerase, combined with knowledge in the art regarding alignment and 
correspondence of aligned regions, are sufficient to convey to one skilled in the art that 
Applicants were in possession of the claimed invention. Specifically, the specification describes 
25 mutant clones of exo⁻ JDF-3 Family B DNA polymerase (representing at least 4 different 
individual mutations covering both Regions II and III) and their relative nucleotide 
discrimination (see Tables V and VI). These mutants, together with the provided description of 
structure/function relationships of Family B Regions II and III, the provided methods of testing 
mutants for reduced discrimination, and knowledge in the art regarding alignments and 
functional regions of Family B DNA polymerases provide adequate written description for the 
genus of isolated recombinant Family B DNA polymerases having reduced discrimination 
against non-conventional nucleotides wherein the polymerase has a mutation in Region II, as 
claimed in independent claim 10. For the same reasons, the specification provides adequate 
written description of the genus of recombinant Family B DNA polymerases from Thermococcus 
species JDF-3 that are 3' to 5' exonuclease deficient as claimed in independent claim 6. Also for 
the same reasons, the specification satisfies the written description requirement for the genus of 
isolated recombinant Family B DNA polymerases comprising an alanine to threonine mutation at 
the site corresponding to A485 of SEQ ID NO: 2 or a mutation at a site corresponding to L408, 
S345 or P410 of SEQ ID NO: 2, wherein the DNA polymerase has reduced discrimination 
against non-conventional nucleotides relative to the wild-type form of that polymerase, as 
claimed in independent claim 16. Because these independent claims are described in a manner 
meeting the written description requirement with respect to the genus claimed, it follows that the 
requirement is also satisfied with regard to their dependent species claims. 

With regard to the scope of written description, Applicants wish to finally note that the 
claims reciting particular amino acid positions for mutation (e.g., claims 7-9 and 12-13, among 
others) recite not specific mutations of the JDF-3 Family B DNA polymerase of SEQ ID NO: 2, 
but mutations of amino acids "corresponding to" specific amino acids thereof. These claims 
therefore recognize and set forth the principle that corresponding structures or amino acids from 
this JDF-3 Family B DNA polymerase are applicable to other Family B DNA polymerases. 
Therefore, Applicants submit that the specification, of which the originally filed claims are a 
part, clearly describes Family B DNA polymerases in a scope broader than specific mutants of 
SEQ ID NO: 2, and in fact broad enough to encompass the full scope of the claims. 

If the Examiner does not agree that applicants have described a sufficient number of 
species to fulfill the written description requirement for the genus, Applicants respectfully 
request that the Examiner provide Applicants with a number of species that would fulfill the 
written description requirement for the genus and to provide citations to case law supporting the 
Examiner's position. In view of the above, Applicants respectfully request the withdrawal of the 
written description rejection of claims 6-45. 
Rejection under 35 U.S.C. §112, second paragraph

Claims 10-45 are rejected under 35 U.S.C. §112, second paragraph as being indefinite for 
use of the term "non-conventional nucleotides." The Office Action states that "while it is clear 
that dATP, dCTP, dGTP and (d)TTP are considered to be 'conventional nucleotides' it is unclear 
what other nucleotides, if any are also considered to be 'conventional.'" Applicants respectfully 
disagree. 

Applicants submit that the specification defines "non-conventional nucleotide" on page 
25 of the specification as referring to "a) a nucleotide structure that is not one of the four 
conventional deoxynucleotides dATP, dCTP, dGTP and TTP recognized by and incorporated by 
a DNA polymerase, b) a synthetic nucleotide that is not one of the four conventional 
deoxynucleotides in (a), c) a modified conventional nucleotide, or d) a ribonucleotide (since they 
are not normally recognized or incorporated by DNA polymerases) and modified forms of a 
ribonucleotide" (emphasis added). Applicants submit that this definition makes it clear that 
according to the invention and with respect to DNA polymerases, there are only four 
"conventional" nucleotides, namely dATP, dCTP, dGTP and TTP. In referring to "the four 
conventional nucleotides," the definition leaves no room for more than those four conventional 
deoxynucleotides listed. Therefore, there is no ambiguity in the term "non-conventional 
nucleotides" as it (and the conventional nucleotides) is defined in the specification. Applicants 
respectfully request that the §112, second paragraph rejection of claims 10-45 be withdrawn. 

Rejection under 35 U.S.C. § 102(e)

Claims 10, 11, 14, 15 and 44 are rejected under 35 U.S.C. § 102(e) as anticipated by Riedl 
et al., U.S. Patent No. 5,882,904. The Office Action states that Riedl et al. teaches a mutant 
Thermococcus barosii DNA polymerase with reduced 3' to 5' exonuclease activity and reduced 
discrimination against dideoxynucleotides or ribonucleotides relative to the wild type. Applicants 
respectfully disagree. 

Applicants submit that Riedl et al. does not teach an isolated recombinant Family B DNA 
polymerase having reduced discrimination against non-conventional nucleotides, wherein the 
DNA polymerase has a mutation in Region II, as required by amended claim 10. Specifically, 
Riedl et al. does not teach a Family B DNA polymerase mutated in Region II. Applicants submit 
that the language of the amendment "wherein the DNA polymerase has a mutation in Region II" 
is supported in the specification at, for example, page 52, lines 19-23. The limits of the regions 
of the Family B DNA polymerases, including Region II, are set out in the Braithwaite and Ito and 
Wong et al. references (Exhibits B and C) cited in the specification and incorporated in the 
specification by reference. The central consensus element of Region II of the Family B DNA 
polymerases is also described in the specification at page 52, line 20. 

Applicants submit that Riedl et al. teaches only Thermococcus barosii Family B DNA 
polymerase mutants bearing mutations in Region III, covering amino acids 488-493, and the exo 
mutation at amino acids 141 and 143. The reference does not teach any mutation in Region II, as 
required by claim 10 as amended. Therefore, applicants submit that Riedl et al. does not 
anticipate the invention of claim 10 and its dependents 11, 14, 15 and 44. Applicants 
respectfully request withdrawal of the § 102(e) rejection as applied to these claims. 

In view of the above, Applicants submit that all issues pertinent to patentability raised in 
the Office Action have been addressed herein. Applicants therefore respectfully request 
reconsideration of the claims. 



Respectfully submitted,



Date: 





Kathleen M. Williams 
Registration No. 34,380
Palmer & Dodge, LLP 



111 Huntington Avenue at Prudential Center 
Boston, MA 02199-7613 



Tel: (617)239-0451 
Fax: (617)227-4420 
Version of amended claims marked to show changes:

6. (Amended) An isolated recombinant Family B DNA polymerase from Thermococcus species 
JDF-3 that is 3' to 5' exonuclease deficient. 

10. (Amended) An isolated recombinant Family B DNA polymerase having reduced 
discrimination against non-conventional nucleotides, wherein said DNA polymerase has a 
mutation in Region II.

12. (Amended) The DNA polymerase of claim 6 [or 10] wherein said DNA polymerase further 
comprises a mutation selected from the group consisting of: a leucine to histidine mutation at a 
site corresponding to L408 of SEQ ID NO: 2; a leucine to phenylalanine mutation at a site 
corresponding to L408 of SEQ ID NO: 2; a proline to leucine mutation at a site corresponding to 
P410 of SEQ ID NO: 2; and an alanine to threonine mutation at a site corresponding to A485 of 
SEQ ID NO: 2. 

14. (Amended) The DNA polymerase of claim 6 [or 10] that has reduced discrimination against 
a non-conventional nucleotide selected from the group consisting of: dideoxynucleotides, 
ribonucleotides and conjugated nucleotides. 

85. (New) The DNA polymerase of claim 10 wherein said mutation in Region II is selected from 
the group consisting of: a leucine to histidine mutation at a site corresponding to L408 of SEQ 
ID NO: 2; a leucine to phenylalanine mutation at a site corresponding to L408 of SEQ ID NO: 2; 
a proline to leucine mutation at a site corresponding to P410 of SEQ ID NO: 2.

86. (New) The DNA polymerase of claim 10, said polymerase further comprising an alanine to 
threonine mutation at a site corresponding to A485 of SEQ ID NO: 2.

87. (New) The DNA polymerase of claim 10 that has reduced discrimination against a 
non-conventional nucleotide selected from the group consisting of: dideoxynucleotides, 
ribonucleotides and conjugated nucleotides.
88. An isolated recombinant JDF-3 DNA polymerase that comprises an alanine to threonine 
mutation at a site corresponding to A485 of SEQ ID NO: 2. 
EXHIBIT A



[Scanned copy of page 957 of the 2001 Sigma-Aldrich chemical catalog. Legible entry headings: 
THYMIDINE 3'-MONOPHOSPHATE; THYMIDINE 5'-MONOPHOSPHATE (Thymidylic acid; TMP; dTMP); 
THYMIDINE PHOSPHORYLASE; a thymidine 5'-triphosphate sodium salt entry (approx. 97%; 
C10H17N2O14P3; FW 482.2 for free acid); THYMIDINE-5'-TRIPHOSPHATE [METHYL-3H]; 
THYMIDYLIC ACID; THYMIDYLYL(3'→5')-2'-DEOXYADENOSINE; THYMIDINE MONOPHOSPHATE, CYCLIC; 
THYMIDINE 5'-MONOPHOSPHATE p-NITROPHENYL ESTER; THYMINE (2,4-dihydroxy-5-methylpyrimidine; 
5-methyluracil); THYMINE-2-14C; THYMINE-METHYL-3H. Product numbers, package sizes and prices 
are not legible.]
INTRODUCTION

This is an update of an earlier compilation and alignment of DNA 
polymerase sequences (Ito and Braithwaite, 1991). As in the 
previous compilation, we attempted to compile complete 
sequences, to facilitate the identification of conserved and variable 
regions of the DNA polymerases (1). This update includes, for 
the first time, three DNA polymerase sequences from Archaea 
(2); two new members of the Family A DNA polymerases; and 
19 new members of the Family B DNA polymerases. In addition, 
we included nucleases that have related amino acid sequences 
to E.coli DNA polymerase I, and the sequence of E.coli DNA 
polymerase III (ε subunit) was aligned to Family C due to its 
homology to Bacillus subtilis DNA polymerase III. 

As in the previous compilation (1), Family A DNA 
polymerases are named for their homology to the product of the 
polA gene specifying E.coli DNA polymerase I; Family B DNA 
polymerases are named for their homology to the product of the 
polB gene encoding E.coli DNA polymerase II; and Family C 
DNA polymerases are named for their homology to the product 
of the polC gene encoding E.coli DNA polymerase III alpha subunit. 

Table 1 summarizes the molecular weights and isoelectric 
points of each DNA polymerase and nuclease. Table 1 also serves 
as a reference guide to the sequences shown in Figures 1A, 1B, 
and 1C. Since no new sequences were published for the Family 
X DNA polymerases (β-like), we have excluded them from this 
compilation. 



SEQUENCE ALIGNMENT 

The multiple alignments of the amino acid sequences for this 
update were performed in most cases by merely adding on to 
our original alignments (1) where possible. Due to the large 
number of sequences added to the alignment for Family B we 
have changed the original alignment in some areas between 
obvious blocks of conserved sequences. The newer sequences 
were added by aligning each to the closest related sequence 
already aligned, or in many cases to the closest related group 
of sequences already aligned. A more recent addition to the 
UWGCG (University of Wisconsin Genetic Computer Group) 
program package, PILEUP, a multiple alignment program, was 
used extensively to try and locate significant homology in groups 
of closely related sequences. These newly formed groups of 
highly related sequences were then regapped to conform with 
the entire alignment based upon the previous alignment of those 
sequences in the new group from the original alignment. As in 
the previous paper, all the final adjustments had to be made by 
eye and, as stated above, in Family B the added sequences led 
to some improvements to the original alignment that became 
evident to the eye when they were being combined with the entire 
alignment by hand. 

GENERATION OF PHYLOGENETIC TREES FOR THE 
DNA POLYMERASE DOMAINS 

Using Felsenstein's PHYLIP program package (71), specifically 
the programs named in the outline below, we generated 
phylogenetic trees for the 9 Family A DNA polymerases (Figures 
2A and 2B) and for the 47 Family B DNA polymerases (Figures 
3A and 3B). The trees for Family A were created from the 
alignment in Figure 1A using the most conserved regions found 
at the following positions: 798 to 814, 877 to 998, 1047 to 1090, 
1104 to 1123, 1131 to 1158, 1175 to 1206, 1236 to 1251, 1284 
to 1305, 1322 to 1340, and 1365 to 1379. These conserved 
regions were recombined and 100 bootstrap samples were 
generated using the SEQBOOT program. Using the DNADIST 
program, we turned the samples into distance matrices using the 
Kimura 2-parameter method. The resulting matrices were then 
input to the NEIGHBOR program using the UPGMA method 
to produce approximately 100 trees. Finally those trees were 
reduced to a single tree using the CONSENSE program. This 
final tree was then plotted for publication using two different 
methods. The trees in Figures 2A and 3A were created by the 
DRAWGRAM program set up to produce a phenogram type tree 
and the trees in Figures 2B and 3B were created by the 
DRAWTREE program. The trees for Family B were created 
from the alignment in Figure 1B, according to the same 
procedure, using the most conserved regions found at the 
following positions: 1407 to 1760, 1885 to 1901, 1956 to 1990, 
2081 to 2100, 2181 to 2210, and 2280 to 2320. The Family B 
DNA polymerases can be subdivided into two subfamilies, the 
protein-primed DNA polymerase subfamily and the RNA-primed 
DNA polymerase subfamily. 
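
The first two steps of the procedure above (recombining the conserved regions, then drawing bootstrap samples) can be sketched as follows. This is a minimal illustration of the general technique, assuming an alignment held as a dictionary of equal-length strings; it is not the PHYLIP SEQBOOT implementation, and the function names, the seed parameter and the toy data are hypothetical.

```python
import random


def concat_regions(alignment, regions):
    """Keep only the listed alignment columns and join them end to end.

    `alignment` maps sequence names to equal-length aligned strings;
    `regions` is a list of (start, end) column ranges, 1-based inclusive.
    """
    return {name: "".join(seq[start - 1:end] for start, end in regions)
            for name, seq in alignment.items()}


def bootstrap_alignment(alignment, n_replicates, seed=0):
    """Draw bootstrap replicates by sampling alignment columns with
    replacement; every replicate has the same width as the input."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have equal length")
    n_cols = lengths.pop()
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    replicates = []
    for _ in range(n_replicates):
        cols = [rng.randrange(n_cols) for _ in range(n_cols)]
        replicates.append({name: "".join(seq[c] for c in cols)
                           for name, seq in alignment.items()})
    return replicates


if __name__ == "__main__":
    toy = {"seqA": "ABCDEFGHIJ", "seqB": "ABC-EFGHIK"}  # toy data only
    conserved = concat_regions(toy, [(2, 4), (8, 9)])
    print(conserved)
    print(len(bootstrap_alignment(conserved, 100)))
```

Each replicate would then be converted to a distance matrix and clustered, and the resulting trees summarized into a consensus; those later steps are not sketched here.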



* To whom correspondence should be addressed
Table 1. The names and subclassification [illegible] star (*) [illegible] using the [illegible] 
program [illegible] polymerases. [illegible] numbers in the table only represent this short sequence 

                                                            [Amino acids]   Mol. Wt.  Isoelectric pt.   Reference

A. Family A DNA polymerases
  1. Bacterial DNA polymerases
     a) E.coli DNA polymerase I                                       928    103,117             5.37   (3)
     b) Streptococcus pneumoniae DNA polymerase I                     877     99,078             4.78   (4)
     c) Thermus aquaticus DNA polymerase I                            832     93,909             6.38   (5)
     d) Thermus flavus DNA polymerase I                               831     93,783             6.00   (6)
  2. Bacteriophage DNA polymerases
     a) T5 DNA polymerase                                             829     94,410             6.19   (7)
     b) T7 DNA polymerase                                             704     79,691             6.45   (8)
     c) SPO1 DNA polymerase                                           924    106,808             5.34   (9)
     d) [illegible] DNA polymerase                                    648     72,561             8.50   (10)
  3. Mitochondrial DNA polymerase
     Yeast mitochondrial DNA polymerase (MIP1)                      1,254    143,479             9.23   (11,12,13)
  4. 5' to 3' exonucleases with homology [illegible]
     a) T4 RNase H (gp 33.2)                                          305     35,558             9.00   (14,15)
     b) T5 exonuclease (gp D15)                                       291     33,448             5.12   (7,16)
     c) T7 exonuclease (gp 6)                                         348     40,126             4.54   (4,8)

B. Family B DNA polymerases
  1. Bacterial DNA polymerase
     E.coli DNA polymerase II                                         783     90,020             6.85   (17)
  2. Bacteriophage DNA polymerases
     a) PRD1 DNA polymerase*                                          553     63,336             6.68   (18,19)
     b) φ29 DNA polymerase*                                           575     66,714             8.83   (20)
     c) M2 DNA polymerase*                                            572     66,423             7.69   (21)
     d) T4 DNA polymerase                                                    103,609             6.20   (22)
  3. Archaebacterial DNA polymerases
     a) Thermococcus litoralis DNA polymerase (Vent)                  774     89,913             8.29   (23)
     b) Pyrococcus furiosus DNA polymerase                            775     90,112             7.92   (24)
     c) Sulfolobus solfataricus DNA polymerase                        882    101,332             9.72   (25)
  4. Eukaryotic Cell DNA polymerases
     (1) DNA polymerase alpha
         a) Human DNA polymerase (alpha)                            1,462    165,859             5.71
         b) S.cerevisiae DNA polymerase I (alpha)                   1,468    166,776             6.14   (27)
         c) S.pombe DNA polymerase I (alpha)                        1,405    159,348             6.85   (28)
         d) Drosophila melanogaster DNA polymerase (alpha)          1,505    171,167             8.22   (29)
         e) Trypanosoma brucei DNA polymerase (alpha)               1,339    151,611             6.39   (30)
     (2) DNA polymerase delta
         a) Human DNA polymerase (delta)                            1,107    123,634             6.94   (31,32)
         b) Bovine DNA polymerase (delta)                           1,106    123,707             7.52   (33)
         c) S.cerevisiae DNA polymerase III (delta)                 1,097    124,618             7.96   (34)
         d) S.pombe DNA polymerase III (delta)                      1,084    123,211             7.63   (35)
         e) Plasmodium falciparum DNA polymerase (delta)            1,094    126,883             8.76   (36)
     (3) DNA polymerase epsilon
         S.cerevisiae DNA polymerase II (epsilon)                   2,222    255,669             6.92   (37)
     (4) Other eukaryotic DNA polymerases
         S.cerevisiae DNA polymerase Rev3                           1,504    172,956             8.86   (38)
  5. Viral DNA polymerases
     a) Herpes simplex virus type 1 DNA polymerase                  1,235    136,547             7.35   (39)
     b) Equine herpes virus type 1 DNA polymerase                   1,220    135,955             6.67   (40)
     c) Varicella-Zoster virus DNA polymerase                       1,194    134,047             7.80   (41)
     d) Epstein-Barr virus DNA polymerase                           1,015    113,417             7.38   (42)
     e) Herpesvirus saimiri DNA polymerase                          1,009    113,934             7.31   (43)
     f) Human cytomegalovirus DNA polymerase                        1,242    137,101             7.25   (44)
     g) Murine cytomegalovirus DNA polymerase                       1,097    123,573             6.68   (45)
     h) Human herpes virus type 6 DNA polymerase                    1,012    115,819             7.11   (46)
     i) Channel Catfish virus DNA polymerase                          985    113,468             7.98   (47)
     j) Chlorella virus DNA polymerase                                913    104,955             6.66   (48)
     k) Fowlpox virus DNA polymerase                                  988    116,658             8.11   (49)
     l) Vaccinia virus DNA polymerase                                 937    108,564             7.50   (50)
     m) Choristoneura biennis DNA polymerase                          964    114,818             7.95   (51)
     n) Autographa californica nuclear polyhedrosis
        virus (AcMNPV) DNA polymerase                                 984    114,337             8.35   (52)
     o) Lymantria dispar nuclear polyhedrosis virus
        DNA polymerase                                              1,013    115,921             9.08   (53)
     p) Adenovirus-2 DNA polymerase*
     q) Adenovirus-7 DNA polymerase*
     r) Adenovirus-12 DNA polymerase*

     a) S-1 maize DNA polymerase*

b) kaUh neurospom intermedia DNA polymerase* 

c) pAE Ascobohis inanersus DNA polymerase* 

d) pCLKl C3aviceps purpurea DNA polymerase 

c) manmhar neurospom crassa DNA polymerase 

f) pEM i4^flricwy bUorqids DNA polymerase* 

^ pGKLl Klu^ronPfces lactis DNA polymerase* 

h) pGKU KUfyveron^ces Uxais DNA polymerase* 

i) pSKL Sacckaromyces kluyyeri DNA polymerase* 

C. Family C DNA polymerases 

1. Bacteid repHcathre DNA polymerases 

a) E.caU DNA polymerase m a subunil 

b) S. typhimurium DNA polymerase m a subumt 

c) Bfldi/U5 subtiUs DNA polymerase DI 
'> Exoti dnoQ (MutD) 

£.co/i DNA polymerase HI « subumt 
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1.056 
1,122 
1.053 

917 
970 
U02 
1.097 
1.021 
/f797 
995 
994 
999 



120,431 
12S.648 
120,863 

105,935 
112.902 
138,279 
126.627 
119.074 
+91.922 
116,345 
117,560 
117>W 



6.65 
6.73 
6.86 

8.62 

9.71 

10.10 

8.76 

9.62 

+8.24 

8.04 

8.33 

9.79 



(54) 
(55) 
(56) 

(57) 
(58) 
(59) 
(60) 
(61) 
(62) 
(63) 
(64) 
(65) 



1160 
1160 
1437 


129.903 
130,118 
162,648 


5.04 
5.05 
5.23 


ill 


243 


27,099 


5.68 


(69.70) 
J. Biol. Chem. 264, 4255-4263.

Figure 3A. Phylogenetic phenogram produced from the alignment of [illegible]
DNA polymerases, on the basis of the DNA polymerase domain: 1407-1760,
[illegible]-1901, [illegible] 2100, 2181-[illegible].
Figure 3B. Family B DNA polymerases, produced as in Figure 3A but plotted by
a different method.
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INTRODUCTION 

More than 40 different DNA polymerases, including some 
putative DNA polymerase sequences deduced from nucleotide 
sequence data, have recently been reported (1 -39). The amino 
acid sequences of these DNA polymerases have been aligned and 
partial homologous regions identified by many investigators 
(2-4,9,10,12-25,27-36,42-51). Based on the segmental 
amino acid sequence similarities, DNA polymerases have been 
classified into two major groups: E. coli DNA polymerase I-Type
and eukaryotic DNA polymerase α-Type (14,44,47,48,51), or
family A DNA polymerases and family B DNA polymerases
(4,9,50). As the number of DNA polymerase sequences
increases, the classification of DNA polymerases becomes
increasingly ambiguous. For example, DNA polymerase delta
of yeast was shown to have amino acid sequence similarity to
the α-Type DNA polymerases (17). It has become necessary to
establish a unified classification of DNA polymerases. Here we
propose to classify DNA polymerases into families A, B, and
C (Figure 1: A, B, and C), according to the amino acid sequence
homologies with E. coli DNA polymerases I, II, and III,
respectively. As new and different prokaryotic and eukaryotic
DNA polymerases are identified, the number of families can
easily be expanded by using additional letters of the alphabet (i.e.,
D, E, etc.).

The bacterium E. coli (strain K12) contains three distinct DNA
polymerases I, II, and III (52). E. coli DNA polymerase I, the
first DNA polymerase discovered, is specified by the polA gene
(52). E. coli DNA polymerase II, encoded by the polB gene,
was recently sequenced and found to be identical to the dinA
gene, a DNA damage inducible gene whose expression is
regulated by the SOS system in E. coli (8,53). Amino acid
sequence alignment shows that E. coli DNA polymerase II has
significant homology with family B (α-Type) DNA polymerases
(8,53,54).

E. coli DNA polymerase III is a multisubunit enzyme encoded
by various dna genes (55); the DNA polymerizing α-subunit is
encoded by the polC (dnaE) gene (56) and the 3'→5' exonuclease
performing ε-subunit by the dnaQ gene (57). The α-subunit
of E. coli DNA polymerase III exhibits an extensive
homology with the corresponding α-subunit of Salmonella
typhimurium DNA polymerase III (35); and both show significant
homology to Bacillus subtilis DNA polymerase III, a single
polypeptide encoded by the polC gene (36).

In summary, family A DNA polymerases are named for their
homology to the product of the polA gene encoding E. coli DNA
polymerase I; family B DNA polymerases are named for their
homology to the product of the polB gene encoding E. coli DNA
polymerase II; and family C DNA polymerases are named for
their homology to the product of the polC gene encoding E. coli
DNA polymerase III.

The eukaryotic DNA polymerase β, the smallest known DNA
polymerase, does not have homology with any of the
DNA polymerase families described above. Instead, DNA
polymerase β has homology with terminal transferases (37). This
β group we will call family X (Figure 1D). The classification
and original reference(s) for the amino acid sequences of each
DNA polymerase are shown in Table 1.
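
The naming rule summarized above can be written down as a small lookup table. The sketch below is only an illustration: the dictionary layout and the function name family_for_gene are hypothetical and do not come from the paper, and family X is given no prototype gene because the text defines it by homology to DNA polymerase β and terminal transferases rather than to an E. coli gene.

```python
# Illustrative lookup of the proposed families. The layout and the helper
# name are assumptions made for this sketch, not part of the paper.
FAMILIES = {
    "A": {"prototype": "E. coli DNA polymerase I", "gene": "polA"},
    "B": {"prototype": "E. coli DNA polymerase II", "gene": "polB"},
    "C": {"prototype": "E. coli DNA polymerase III", "gene": "polC"},
    # Family X has no E. coli prototype gene in the text.
    "X": {"prototype": "eukaryotic DNA polymerase beta", "gene": None},
}


def family_for_gene(gene):
    """Return the family letter whose prototype is encoded by `gene`, else None."""
    for letter, info in FAMILIES.items():
        if info["gene"] is not None and info["gene"] == gene:
            return letter
    return None


if __name__ == "__main__":
    for gene in ("polA", "polB", "polC"):
        print(gene, "->", "family", family_for_gene(gene))
```

Adding a further family (D, E, and so on, as the text anticipates) would only require another dictionary entry.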

All of the family A DNA polymerases, except for yeast 
mitochondrial DNA polymerase I, are prokaryotic and are very 
sensitive to dideoxynucleotide inhibitors, and therefore are useful
enzymes for DNA sequencing by the chain-termination method 
(58). The family A DNA polymerases are resistant to aphidicolin. 
The family B DNA polymerases are quite extensive in number 
and variety. Most of the family B DNA polymerases, if not all, 
are sensitive to aphidicolin and relatively resistant to 
dideoxynucleotide inhibitors. Most of the family B DNA 
polymerases, except for pAI2 (33) and yeast DNA polymerase 
II (16), contain the highly conserved amino acid sequence motif
YGDTD, which has been suggested to form part of the dNTP 
binding site. Amino acid substitutions in this conserved sequence 
resulted in defects in the DNA polymerase activity without 
affecting the 3'→5' exonuclease activity (59,60,61). The family
C DNA polymerases are major bacterial replicative DNA 
polymerases which do not have appreciable homology with those 
of family A and B DNA polymerases. B. subtilis DNA 
polymerase III is a single polypeptide that is highly sensitive to
hydroxyphenylazouracil (62). It is anticipated that the number 
of sequenced family C DNA polymerases will increase rapidly, 
since all of the aerobic bacteria may contain a member of this 
family of DNA polymerases. 

SEQUENCE ALIGNMENT 

The 37 complete DNA polymerase sequences and 3 complete 
terminal deoxynucleotidyltransferase (TDT) sequences are listed 
in 4 groups; the family A DNA polymerases, the family B DNA 
polymerases, the family C DNA polymerases, and family X DNA 
polymerases (including TDTs). In order to limit the space needed 
for the alignment, we omitted DNA polymerase sequences that 
are very similar to the prototype DNA polymerase. The DNA
polymerases not shown include: herpes virus type-2 (63),
adenovirus type-5 (64), bacteriophage T3 (65), and bacteriophage
PZA (66).

ACCURACY OF SEQUENCE DATA

Where sequence ambiguity existed, we contacted the authors to
obtain the updated sequence information. We found that a few
published sequences differ at one or more positions from their
GenBank/EMBL entry. Again, we have communicated with the
primary author to confirm the correct sequences.

The multiple alignment of the amino acid sequences was
obtained from a series of pairwise alignments combined and adjusted
by eye in larger and larger subsets of similar sequences. The
process of combining and adjusting by eye was aided by modified
versions of the MOTIF program (67) and the [illegible] program
(68). The GAP and BESTFIT programs, from UWGCG
(University of Wisconsin Genetics Computer Group) (69), initially
generated the pairwise alignments, adjusted for maximum
alignment that allowed for a considerable number of gaps. We
then compressed these alignments by eye to give a more
contiguous alignment. The alignment of the sequences for optimal
similarity is straightforward in the areas of relatively conserved
structure but is much more arbitrary in the more varied sequence
areas. The alignment of the varied areas should therefore be
regarded as less than optimal in view of the difficulties concerned
with multiple alignments in these areas.
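
As a rough illustration of the pairwise step described above, the sketch below performs a global (Needleman-Wunsch style) alignment of two short peptide strings. It is not the GAP or BESTFIT program: the function name global_align, the identity-based scoring and the linear gap penalty are all simplifying assumptions chosen for this example, whereas production aligners use substitution matrices and separate gap-opening and gap-extension weights.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of strings a and b with a linear gap penalty.

    Returns (score, aligned_a, aligned_b). The scoring values are
    illustrative assumptions, not those of any published program.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Hypothetical toy inputs built around the YGDTD motif mentioned in the text.
    s, x, y = global_align("YGDTDS", "YGTDS")
    print(s)
    print(x)
    print(y)
```

Where sequences are highly conserved such an alignment is essentially unique; in variable stretches several placements of the gaps can score equally well, which is one reason the varied areas are described above as less reliable.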

Finally, we invite further correction from readers, and welcome
suggestions, revisions and alternative alignments.
DNA polymerase                                                      Reference

A. Family A DNA polymerases
 Bacterial DNA polymerases
    [three entries illegible]                                       (1), (2), (3)
 Bacteriophage DNA polymerases
    a) T5 DNA polymerase                                            (4)
    b) T7 DNA polymerase                                            (5)
    c) Spo2 DNA polymerase                                          (6)
 Mitochondrial DNA polymerase
    Yeast mitochondrial DNA polymerase (MIP1)                       (7)

B. Family B DNA polymerases
 Bacterial DNA polymerase
    E. coli DNA polymerase II                                       (8)
 Bacteriophage DNA polymerases
    a) PRD1 DNA polymerase*                                         (9,10)
    b) φ29 DNA polymerase*                                          (11)
    c) M2 DNA polymerase*                                           (12)
    d) T4 DNA polymerase                                            (13)
 Eukaryotic cell DNA polymerases
    a) Human DNA polymerase alpha                                   (14)
    b) Yeast DNA polymerase I                                       (15)
    c) Yeast DNA polymerase II                                      (16)
    d) Yeast DNA polymerase III (delta)                             (17)
    e) Yeast DNA polymerase Rev3                                    (18)
 Viral DNA polymerases
    a) Herpes-1 DNA polymerase                                      (19)
    b) Human cytomegalovirus DNA polymerase                         (20)
    c) Epstein-Barr virus DNA polymerase                            (21)
    d) Varicella-Zoster virus DNA polymerase                        (22)
    e) Fowlpox virus DNA polymerase                                 (23)
    f) Vaccinia virus DNA polymerase                                (24)
    g) Autographa californica nuclear
       polyhedrosis virus (AcMNPV) DNA polymerase                   (25)
    h) Adenovirus-2 DNA polymerase*                                 (26)
    i) Adenovirus-7 DNA polymerase*                                 (27)
    j) Adenovirus-12 DNA polymerase*                                (28)
 5. Eukaryotic linear DNA plasmid encoded DNA polymerases
    a) S-1 maize mitochondrial DNA polymerase*                      (29)
    b) Kluyveromyces lactis plasmid pGKL1 DNA polymerase*           (30)
    c) Kluyveromyces lactis plasmid pGKL2 DNA polymerase*           (31)
    d) Claviceps purpurea plasmid pCLK1 DNA polymerase*             (32)
    e) Ascobolus immersus plasmid pAI2 DNA polymerase*              (33)

C. Family C DNA polymerases
 Bacterial replicative DNA polymerases
    a) E. coli DNA polymerase III α subunit                         (34)
    b) Salmonella typhimurium DNA polymerase III α subunit          (35)
    c) Bacillus subtilis DNA polymerase III                         (36)

D. Family X DNA polymerases
    a) Rat DNA polymerase β                                         (37)
    b) Human DNA polymerase β                                       (38,39)
    c) Human terminal deoxynucleotidyltransferase (TdT)             (40)
    d) Bovine terminal deoxynucleotidyltransferase (TdT)            (41)
    e) [illegible] terminal deoxynucleotidyltransferase (TdT)       (41)

Table 1. The main families and subclassifications of DNA polymerases. Those
DNA polymerases marked with a star (*) are protein-primed DNA polymerases.
Human DNA polymerase α gene expression is cell
proliferation dependent and its primary structure is
similar to both prokaryotic and eukaryotic replicative
DNA polymerases

Scott W.Wong, Alan F.Wahl, Pau-Miau Yuan1,
Naoko Arai2, Barbara E.Pearson, Ken-ichi
Arai2, David Korn, Michael W.Hunkapiller1 and
Teresa S.-F.Wang

Laboratory of Experimental Oncology, Department of Pathology,
Stanford Medical School, Stanford University, Stanford, CA 94305,
1Applied Biosystems, Foster City, CA 94404 and 2DNAX Research
Institute of Molecular and Cellular Biology, Palo Alto, CA 94304,
USA

Communicated by A.Kornberg

We have isolated cDNA clones encoding the human DNA
polymerase α catalytic polypeptide. Studies of the human
DNA polymerase α steady-state mRNA levels in quiescent
cells stimulated to proliferate, or normal cells compared
to transformed cells, demonstrate that the
polymerase α mRNA, like its enzymatic activity and de
novo protein synthesis, positively correlates with cell
proliferation and transformation. Analysis of the deduced
1462-amino-acid sequence reveals six regions of striking
similarity to yeast DNA polymerase I and DNA polymerases
of bacteriophages T4 and φ29, herpes family
viruses, vaccinia virus and adenovirus. Three of these
conserved regions appear to comprise the functional active
site required for deoxynucleotide interaction. Two
putative DNA interacting domains are also identified.
Key words: primary structure/replicative DNA polymerases/
sequence similarity/structural gene/transcription



Introduction 

Cell proliferation and the transmission and maintenance of
error-free genetic information from one generation to the
next are dependent on the mechanism of DNA replication.
Genomic DNA replication is a complex and tightly regulated
process involving the orderly coordination of many
protein-protein and protein-DNA interactions (Kornberg,
1980, 1982). A key component of the chromosomal replication
apparatus is DNA polymerase α, which is generally accepted
as the principal polymerase involved in eukaryotic
DNA replication (Kornberg, 1980, 1982; Campbell, 1986;
Fry and Loeb, 1986). Many lines of evidence support this
concept: its enzymatic activity positively correlates with
DNA synthesis during cell proliferation (Fry and Loeb,
1986); all its specific inhibitors also inhibit DNA replication
in vivo (Ikegami et al., 1978; Fry and Loeb, 1986);
a mutant which is temperature sensitive for DNA synthesis
has been identified as a DNA polymerase α mutant
(Murakami et al., 1985); and monoclonal antibodies specific
for DNA polymerase α inhibit DNA synthesis in permeabilized
cells or when microinjected into nuclei (Miller,M.R.
et al., 1985; Miller et al., 1986). The recent reports that DNA
polymerase α plays a central role in SV40 DNA replication
in vitro (Li and Kelly, 1984, 1985; Stillman and Gluzman,
1985; Wobbe et al., 1985; Murakami et al., 1986), a model
system of eukaryotic DNA replication, further underscore
the importance of this enzyme.

DNA polymerase α lacks the 3'-5' proofreading exonuclease
which is required to remove mismatched nucleotides
during DNA polymerization (Kornberg, 1980, 1982;
Fry and Loeb, 1986). However, another eukaryotic DNA
polymerase, designated δ, has been identified (Byrnes et al.,
1976), which possesses a 3'-5' exonuclease activity and,
like polymerase α, is sensitive to the inhibitor aphidicolin
(Lee et al., 1984). The relationship between polymerases
α and δ is unknown. The identification of a cellular protein,
proliferating cell nuclear antigen (PCNA), required for
efficient viral DNA chain-elongation in vitro, and the
discovery that this protein is able to stimulate DNA polymerase
δ but not polymerase α activity, raises the question
of whether there are two polymerases involved in eukaryotic
DNA replication (Bravo et al., 1987; Prelich et al.,
1987a,b). The recent finding of a cryptic proofreading
3'-5' exonuclease activity associated with the catalytic
polypeptide of Drosophila embryo DNA polymerase α,
when separated from other subunits (Cotterill et al., 1987),
further stimulates interesting questions about the relationship
between these two DNA polymerases.

Despite more than a quarter century of biochemical characterization
of DNA polymerase α (Fry and Loeb, 1986),
little is known about the regulation of the expression of this
essential DNA replication enzyme, which nucleotide structural
elements may be responsible for the cell-proliferation
dependent expression and whether its expression is the direct
target of cascading biochemical events induced by growth
factors or mitogens. In addition, nothing is known about the
structure-function relationships of DNA polymerase α protein
domains required for substrate recognition, or for the
orderly coordination of protein-protein and protein-DNA
interactions during chromosome replication. In an attempt
to address these questions and to define the relationships between
DNA polymerases α and δ, a near full-length cDNA
of the human DNA polymerase α catalytic polypeptide has
been isolated.

Comparison of the steady-state mRNA levels of quiescent
cells stimulated to proliferate, or normal human cells compared
to transformed cells, demonstrates that the previously
reported increase of enzymatic activity during cell proliferation
or transformation correlates with the level of
steady-state mRNA. Analysis of the deduced primary structure
of human DNA polymerase α with several viral DNA
polymerases, yeast DNA polymerase I, Escherichia coli
bacteriophage T4 and Bacillus phage φ29 DNA polymerase
identifies six regions highly conserved among these
polymerases. Three of these conserved domains appear to
comprise functionally active sites required for deoxynucleotide
interaction and two other regions are postulated
Table I. Human DNA polymerase α peptide sequence analysis

Columns: Cycle no. (1-16), followed by a Res/pmol pair for each of seven
peptides (T19, T23, T24, T25, [illegible], T264, T265). The column contents
legible in the scan are listed in order of appearance; ? marks an entry
that could not be read.

  Res:   S, G, Y, S, E, V, N, L, S, K
  pmol:  20, 54, 58, 11, 41, 46, 24, 30, ?, 15

  Res:   A, A, Y, A, G, G, L, V, L, D, P, K
  pmol:  92, ?, 62, 76, 40, 37, 36, 56, 44, 18, 20, 19

  Res:   G, P, C, W, L, E, V
  pmol:  24, 26, 5, 8, 25, 14, 19, 5

  Res:   M, Y, A, F, E, I, P, D, V, P, E, K
  pmol:  17, 23, 25, 24, ?, 24, 14, 8, 24, 10, 5, 8

  Res:   [illegible]
  pmol:  50, 53, 51, 22, ?, 37, 10, ?, 45, 22, 17

  T264 and T265 (residues largely illegible)
  pmol:  7.9, 7.4, 3.4, 2.7, 6.5, 1.6, 5.5, 2.2, 4.2, 2.3, 2.6, 1.4, 3.6
  Res/pmol pairs: S 0.4; Q 3.1; R <<; I 2.4

Shown are the PTH-amino acids observed at each sequence cycle. << indicates yield too low for quantitation.

to be DNA binding domains. The presence of these conserved
amino acid sequences among replicative DNA polymerases
from phylogenetically distant species suggests they all may
have evolved from a single primordial gene.

Results 

Purification of the catalytic polypeptide and protein
sequencing

The advancement of utilizing monoclonal antibodies specifically
against human DNA polymerase α for immunoaffinity
purification has defined the protein structure and subunit
components of this enzyme (Wang et al., 1984; Wong et
al., 1986). DNA polymerase α contains (i) the catalytic polypeptide
which, in vitro, is a family of large phosphopolypeptides
of 180-125 kd, previously demonstrated by tryptic
peptide mapping to be derivatives of the same primary structure
(Wong et al., 1986); (ii) a 77 kd phosphoprotein of
unknown function; and (iii) two polypeptides of 55 and 49 kd
reported to be associated with DNA primase activity (Tseng
and Ahlem, 1983) (Figure 1A). The catalytic polypeptides
of human polymerase α were separated from associated
Fig. 2. Human DNA polymerase α cDNA. (A) Restriction map of human DNA polymerase α and overlapping cDNA clones. The stippled box
represents the coding region of human DNA polymerase α and the solid line indicates the 5' and 3' non-coding regions. ▲ indicates the locations of
each of the previously determined amino acid sequences. The five overlapping cDNA clones are pcD-KBpolα, E1-14b8, E1-14a, E1-12 and [illegible].
(B) Nucleotide sequence and deduced amino acid sequence of human DNA polymerase α. Nucleotides are numbered at the upper right and amino
acids at the lower right. Peptide sequences derived from immunopurified human DNA polymerase α preparation and used to design oligonucleotide
probes as described under Materials and methods are underlined with dotted lines and labeled according to Table I. Amino acid number starts at
methionine 1. * indicates the termination codon TAA.

Fig. 3. [Legend largely illegible in the scan. Legible fragments: the chromosome is presented at the left; hybrids containing human
chromosomes 17, 21 and X; hybrid 13-13 containing only the human X chromosome; somatic hybrids (Wang et al., 1985); hybrid clone
XVIII-54A-2a; other CF series hybrid clones containing an X/[illegible] translocation; hybridizations were as described under
Materials and methods.]



subunits by gel permeation HPLC columns (Figure 1B). The
separated catalytic polypeptides were pooled and treated as
a single entity, digested with trypsin and fractionated by
preparative reverse-phase HPLC. The amino acid sequences
of seven peptides were determined (Table I) as described
in Materials and methods. In all, the sequences of 85 amino
acids were established and used to design single, long antisense
oligonucleotide probes (Lathe, 1985) by which a near
full-length cDNA clone of human DNA polymerase α was
isolated.

Primary structure of human DNA polymerase α
The 5433 nucleotides of the human polymerase α cDNA
contain a single open reading frame coding for 1462 amino
acids (Figure 2A and B). An in-frame initiator ATG codon
flanked by nucleotides matching Kozak's criteria for a translation
initiation site was identified (Figure 2B) (Kozak, 1981).
All seven experimentally determined human DNA polymerase
α peptide sequences listed in Table I are identified within
this amino acid sequence (Table I, Figure 2A and B). Based
on the deduced amino acid sequence, the estimated Mr
of the recombinant polymerase α is 165 kd. Primer extension
with two synthetic oligonucleotides corresponding to
two separate regions of the 5'-end localizes the transcription
start site 295 nucleotides upstream from the putative
translation initiation codon (data not shown).
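
The figures quoted in this paragraph can be cross-checked with simple arithmetic. The sketch below is a back-of-the-envelope illustration only: the helper names are hypothetical, and the average residue mass of 110 Da is a common rule of thumb assumed here, not a value from the paper. The reported 165 kd was estimated from the deduced sequence itself, so the rule-of-thumb figure is expected to be close but not identical.

```python
# Assumption: ~110 Da is a widely used average mass per amino acid residue.
AVG_RESIDUE_MASS_DA = 110.0


def orf_length_nt(n_residues, include_stop=True):
    """Nucleotides needed to encode n_residues, optionally plus a stop codon."""
    return 3 * n_residues + (3 if include_stop else 0)


def approx_mass_kd(n_residues, avg_residue_mass=AVG_RESIDUE_MASS_DA):
    """Rule-of-thumb polypeptide mass in kd (kilodaltons)."""
    return n_residues * avg_residue_mass / 1000.0


residues = 1462   # length of the open reading frame, from the text
cdna_nt = 5433    # length of the cDNA, from the text

orf_nt = orf_length_nt(residues)      # coding nucleotides including TAA stop
noncoding_nt = cdna_nt - orf_nt       # left over for 5' and 3' non-coding regions
mass_kd = approx_mass_kd(residues)    # rough mass estimate

if __name__ == "__main__":
    print("ORF length (nt):", orf_nt)
    print("Non-coding remainder (nt):", noncoding_nt)
    print("Approximate mass (kd):", round(mass_kd, 1))
```

The open reading frame therefore fits comfortably within the 5433-nucleotide cDNA, and the rough mass estimate falls within a few percent of the 165 kd stated in the text.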

Localization of the structural gene
The human DNA polymerase α gene was previously mapped
by expression to a single genetic locus on the short arm
of the X chromosome at the junctional region of Xp21.3 to
Xp22.1 (Wang et al., 1985). The cDNA insert was shown
to be X chromosome-linked by comparative genomic Southern
hybridization with normal male DNA (46,XY) and DNA
from a cell line of 4X (karyotyped 49,XXXXY) DNA
(Figure 3A). Two EcoRI-digested genomic DNA bands are
observed by hybridization with a PstI restriction fragment
of pcD-KBpolα (Figure 2A), both resulting in 1:4 ratio of
signal intensity (Figure 3A). Using this PstI restriction fragment
of pcD-KBpolα cDNA clone, the chromosomal localization
of the DNA polymerase α structural gene was
analyzed directly by Southern hybridization of EcoRI-digested
genomic DNA samples from a panel of
human-rodent hybrids containing either an intact human
X chromosome, different but overlapping regions of the
human X chromosome and a hybrid clone with an interstitial
deletion of the human X chromosome (Wang et al., 1985).
Under conditions that exclude cross-hybridization to rodent
Fig. 4. Steady-state mRNA analyses from quiescent cells stimulated to
proliferate and comparison of normal and transformed cells.
(A) Northern hybridization analysis of human DNA polymerase α
mRNA. 10 μg of polyadenylated mRNA from early mid-log human
KB cells was hybridized with 30 ng of 32P-labeled HindIII/BamHI
700-bp restriction fragment as described in Materials and methods.
RNA sizes are given in kb and were determined by staining of parallel
blot with RNA standard ladder marker from Bethesda Research
Laboratory (BRL). (B) Steady-state mRNA from quiescent and
proliferative cells. 25 μg total RNA isolated from normal human
lung fibroblast (IMR90), cultured in the 0.1% fetal calf serum (FCS)
for [illegible] h to effect quiescence (lane 1) or stimulated to proliferate by
activation with 10% FCS for 30 h (lane 2) were hybridized with
100 ng of 32P-labeled anti-sense DNA polymerase α ribo-probe (5 x
10[exponent illegible] c.p.m./μg) as described in Materials and methods.
Autoradiography was for 5 days with intensifier screen at -70°C. (C)
Steady-state mRNA comparison of transformed and normal human
cells. 5 μg of poly(A)+ mRNA from transformed cell line, MDA4,
(lane 1) and from normal human proliferating tissue isolated from 20
week placenta (lane 2) were compared for relative abundance of steady-state
DNA polymerase α mRNA as described in Materials and
methods.



Table II. DNA polymerase α activity in transformed and
non-transformed human cells

Transformed    Units per                     Non-transformed      Units per
cells          10[exponent illegible] cells  cells                10[exponent illegible] cells

293            19.0                          TNHF                 2.1
KB             14.0                          GM1604               1.7
                                             IMR90 (midlog)       2.4
                                             IMR90 (quiescent)    UD


ization yields a single 5.8 kb band (Figure 4A) which is sufficient
to encode a polypeptide of 165-180 kd. Northern
hybridization of restriction fragments from each of the overlapping
cDNA clones or the 5'-end of the near full-length
cDNA clone all result in a single mRNA hybridization signal
of 5.8 kb. These results indicate there is neither usage of
multiple polyadenylation addition sites, generating mRNA
of variable length from a single gene, nor various splicing
events or processing occurring to generate multiple mRNAs.

The enzymatic activity and de novo protein synthesis of
DNA polymerase α both correlate positively with cell proliferation
(Bensch et al., 1982; Fry and Loeb, 1986; Thommes
et al., 1986). To examine whether the transcriptional
expression of polymerase α correlates with de novo protein
synthesis and the expression of enzymatic activity, a parallel
analysis of polymerase α steady-state mRNA levels from
quiescent cells stimulated to proliferate was performed.
Steady-state mRNA from normal human lung fibroblast cell
culture (IMR-90), arrested in quiescent state (G0) by serum
deprivation, and from cells activated to proliferate by serum
stimulation were compared by Northern blot hybridization
(Figure 4B). The steady-state message increases >20-fold,
18 h after serum stimulation, 6 h prior to the peak of DNA
synthesis. Meanwhile, the enzymatic activity per cell also
increases 10-fold (Table II). Therefore, the increase of
DNA polymerase α steady-state message following the activation
of quiescent cells to proliferate is similar to those
observed with several other genes involved in DNA synthesis,
that undergo transient expression.

A comparative study of message levels in normal growing
human tissue and transformed cell lines was also performed.
With equal amounts of poly(A)+ mRNA, no
detectable polymerase α message was found in RNA isolated
from a normal 20-week human placenta, in contrast to a
readily detectable 5.8 kb signal from human breast carcinoma
cell line (MDA4) (Figure 4C). Genomic Southern
blots of normal human cells and transformed cells indicate
equal gene dosage of human DNA polymerase α (data not
shown), indicating that the abundance of the polymerase α
message in transformed cells is not due to the amplification
of the polymerase α gene.

Similarities with other DNA polymerases
Comparison of the primary amino acid sequence of human
DNA polymerase α deduced from the nucleotide sequences
of the cDNA clone to sequences derived from DNA polymerases
of herpes, Epstein-Barr, cytomegalo, vaccinia,
adeno-2 viruses (Gibbs et al., 1985; Earl et al., 1986; Larder
et al., 1987; Kouzarides et al., 1987), E.coli phage T4
(National Data Base Bank), Bacillus phage φ29 (Yoshikawa
and Ito, 1982) and yeast DNA polymerase I (Johnson et al.,
1985) reveals several regions of marked similarity. Within
a 472-amino-acid region of human DNA polymerase α
(amino acids 609-1081) six regions are identified that contain
extensive similarities among these DNA polymerases.
The regions are designated according to the extent of
similarity from I to VI with region I being the most similar
(Figure 5A). In addition to these six highly conserved
regions, the sequence spanning region VI of human DNA
polymerase α, amino acids 908-939, shares ~41%
similarity with T4 gene 46 protein, which is an exonuclease.
The significance of these similarities is further underscored
by the relative location of these regions within the respec-



Units of DNA polymerase are defined as 1 nmol of labeled dAMP incorporated/h
at 37°C in the presence and absence of 5 μg/ml
aphidicolin. The results are averages of duplicate determinations. UD
represents undetectable background value.

DNA, the structural gene for human DNA polymerase α
was mapped precisely to the previously determined expression
locus at Xp21.3 to Xp22.1 (Figure 3B).

Analysis of steady-state mRNA during cell proliferation
and transformation

Characterization of this cDNA insert by Northern hybrid-
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; of human DNA polymerase a and other DNA polymcraacs. (A) Amino acid sojucnce simUarity betweca human DNA 



dashed line Gaps are indicated by dashes and extensive gap<s are marked by the number of amino acids contained withm the gap- Tht dMignated 
Z^ed «.£ic^ are marked by dashed lines under the amino acid residues. Amino acids 998-1005 .f Imman polymery « are defined as region 
I. amino ^i^ 839-tf78 are region D; amino acids 943 -9W are region ffl; amino acids 609-650 ar^ defined fj^^^^^^^ nl 
adiino acids 1075- 1081 and 909'-926, respectively. (B) Relative spatial arrangemni of the conserved resKWS of DNA poiymerasts. Earfi DNA 
polymerase polypeptide is repirsented by a straight line with NHj and COOH denoting the amino and carboxyl ternimi tespectivdy. The black bars 
represent the consensus sequences of each region. Similar regions of each polymerase polypeptide are aligned by vcmcal Imes. 
Fig. 6. Conserved sequences between human DNA polymerase α and herpes simplex virus 1 DNA polymerase and predicted deoxynucleotide
binding regions. The six conserved regions between human DNA polymerase α and herpes simplex virus 1 are aligned; identical amino acid residues are
boxed. Amino acids that were identified in herpes simplex virus 1 mutants as single amino acid substitutions are boxed in shade.
Fig. 7. Amino acid sequences of possible DNA binding regions and schematic summary of putative functional domains of human DNA polymerase
α. (A) Possible DNA binding sequences. Two regions of cys/his-rich sequences of human DNA polymerase α are depicted, region 1 from amino
acid 650-715 and region 2 from amino acid 1245-1376. Cysteine and histidine residues are boxed in shade and amino acids capable of interacting
with the phosphate backbone of DNA in the possible loop regions are marked by •. (B) Schematic representation of the putative functional domains
of human DNA polymerase α and hydropathy plot.



tive polypeptides. The six regions are in the same linear
spatial arrangement, IV-II-VI-III-I-V, on each poly-
peptide (Figure 5B). However, the distances between each
consensus region in the polypeptides examined are variable.
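
The linear arrangement stated above can be checked against the residue coordinates quoted in the Figure 5 legend. The following is a minimal sketch, not part of the original analysis: it uses only the start residue of each region in human DNA polymerase α, and it assumes that the two ranges given last in the legend (1075-1081 and 909-926) belong to regions V and VI in that order. The function names are illustrative.

```python
# Start residues of the six conserved regions in human DNA polymerase alpha,
# taken from the Figure 5 legend. Assigning 1075 to region V and 909 to
# region VI is an assumption based on the legend's "respectively".
REGION_STARTS = {"I": 998, "II": 839, "III": 943, "IV": 609, "V": 1075, "VI": 909}


def linear_order(starts):
    """Return region names ordered from the amino to the carboxyl terminus."""
    return [name for name, _ in sorted(starts.items(), key=lambda kv: kv[1])]


def start_to_start_gaps(starts):
    """Return the distance in residues between the starts of neighbouring regions."""
    ordered = linear_order(starts)
    return {f"{a}->{b}": starts[b] - starts[a] for a, b in zip(ordered, ordered[1:])}


print("-".join(linear_order(REGION_STARTS)))  # IV-II-VI-III-I-V
print(start_to_start_gaps(REGION_STARTS))
```

Running the same two functions on the coordinates of another polymerase would show the same order with different spacings, which is the pattern the text describes.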

Sequence comparisons of DNA polymerase β (Zmudzka
et al., 1986; Matsukage et al., 1987), terminal transferase
(Peterson et al., 1984, 1985) and E.coli DNA polymerase
I (Joyce et al., 1982) reveal no significant similarity to the
conserved regions described above, but a sequence similar
to region II is identified in the dnaE gene product, the α
subunit, of E.coli DNA polymerase III (Tomasiewicz and
McHenry, 1987).

The presence of these highly conserved domains in replica-
tive DNA polymerases from human to such phylogenetic-
ally distant species as bacteriophage T4 and φ29 suggests
that these DNA polymerases may all be derived from a com-
mon primordial gene.

Predicted functional domains 

Conservation of these sequences is likely to reflect the need 
to maintain function. A detailed comparison was made of 
the amino acid sequences of human DNA polymerase α and
herpes DNA polymerase (Figure 6). Extensive sequence
similarity in all six regions is identified, with region I hav-
ing the highest of 87.5%, followed by region II, 60%; region
V, 57%; region III, 47%; region IV, 26%; and region VI,
10.5%. Mutations of herpes simplex virus conferring altered
anti-viral drug sensitivity have been mapped to several of
these conserved regions of the herpes DNA polymerase gene
(Knopf et al., 1981; Coen et al., 1983; Quinn and McGeoch,
1985). Most mutants which demonstrate altered sensitivity



to the pyrophosphate analog phosphonoacetic acid also ex-
hibit more resistance to the nucleoside analog aphidicolin
(Coen et al., 1983; Larder et al., 1987). Sequence analysis
of several of these mutants, derived from a single viral strain,
confirms that all contain single amino acid substitutions
within conserved regions II and III (Knopf, 1987; Larder
et al., 1987; Tsurumi et al., 1987) and most recently, another
mutant was identified having a single amino acid substitu-
tion in region V (J.S.Gibbs, H.C.Chiou and D.M.Coen, per-
sonal communication). Mutations conferring altered
sensitivity to these drugs are inferred to be at the dNTP and
PPi binding domains. Based on the studies of herpes DNA
polymerase mutants, regions II, III and V of human DNA
polymerase α could be the essential catalytic domains re-
quired for dNTP interaction.
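
The per-region percentages quoted in this section can be illustrated with a simple identity calculation over a pairwise alignment. The sketch below is an assumption about how such a figure might be computed; the text does not state the scoring method, and a "similarity" score may also credit conservative substitutions. The two eight-residue strings are made-up placeholders, not the real region I sequences. They show that seven matches in eight aligned positions give 87.5%.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only if both residues agree and neither is a gap.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return 100.0 * matches / len(aligned_a)


# Values as reported in the text for human polymerase alpha versus herpes polymerase.
REPORTED_SIMILARITY = {"I": 87.5, "II": 60.0, "III": 47.0, "IV": 26.0, "V": 57.0, "VI": 10.5}


def rank_regions(scores):
    """Return region names from most to least similar."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical 8-residue alignment with one mismatch (placeholder letters only).
print(percent_identity("ACDEFGHI", "ACDEFGHK"))  # 87.5
print(rank_regions(REPORTED_SIMILARITY))         # ['I', 'II', 'V', 'III', 'IV', 'VI']
```

Here the denominator is the full alignment length, gaps included. Dividing by the shorter sequence or by ungapped columns only would give slightly different values.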

In eukaryotic cells, proteins involved in nucleic acid bind-
ing or gene regulation were found to contain cysteine-
histidine rich sequences that are potential metal-binding do-
mains which may play an essential role in nucleic acid bind-
ing and gene regulation (Miller J. et al., 1985; Berg, 1986).
Two regions containing such a motif are found in the DNA
polymerase α sequence (Figure 7). One region, amino acids
650-715, contains the sequence Cys-Xs-His-Ajy-Cys-JiCi-
Cys-XirCys-X3-His, where X represents amino
acids other than cysteine and histidine, and Cys/His-A;,-
Cys/His is capable of forming a tetrahedral box structure
with an extended protein loop (Figure 7A). The extended
loop between residues 659 and 685 contains many amino
acids with side chains capable of interacting with the
phosphate backbone of DNA (Ohlendorf and Matthews,
1983). Another cys/his-rich sequence within the carboxyl-
terminus of human DNA polymerase α sequence, from
amino acid 1245-1376, is identified (Figure 7B). This
region contains the sequence His-Xj-His-Xis-Cys-Jtj-Cys-
X-Cys-XrCys-X23-Cys-X4-Cys-X32-Cys-X4-Cys-X,7-Cys-
X2-Cys, which has the potential to form three tetrahedral
box structures defining three extended DNA binding loops
as described above. Figure 7C summarizes these possible
functional domains.
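
The Cys-Xn-His notation used above can be derived mechanically from a one-letter protein sequence. The sketch below is a hypothetical helper, not a method described in the text. It lists every cysteine and histidine in order and writes the number of intervening residues as Xn, following the convention that X stands for any residue other than cysteine or histidine. The input strings are toy sequences, not the polymerase α sequence.

```python
def cys_his_spacing(sequence):
    """Describe the Cys/His spacing of a one-letter protein sequence.

    Example of the notation: "CAAHAAAC" becomes "Cys-X2-His-X3-Cys".
    """
    names = {"C": "Cys", "H": "His"}
    parts = []
    previous = None  # index of the last Cys or His seen
    for index, residue in enumerate(sequence.upper()):
        if residue not in names:
            continue
        if previous is not None:
            gap = index - previous - 1
            if gap == 1:
                parts.append("X")        # single spacer, written without a count
            elif gap > 1:
                parts.append(f"X{gap}")  # n spacer residues
        parts.append(names[residue])
        previous = index
    return "-".join(parts)


print(cys_his_spacing("CAAHAAAC"))  # Cys-X2-His-X3-Cys
print(cys_his_spacing("HACC"))      # His-X-Cys-Cys
```

Applied to residues 650-715 or 1245-1376 of the deduced sequence, a helper of this kind would reproduce the spacing strings given in the text, which makes it easy to verify the subscripts against the sequence itself.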

Discussion 

Genetic studies of yeast and several somatic cell lines im-
ply that there are specific regulatory or restriction points
in the growth cycle of the cell (Hartwell et al., 1974;
Siminovitch and Thompson, 1978). Several genes involved
in DNA synthesis such as thymidine kinase (tk) (Groudine
and Casimir, 1984; Coppock and Pardee, 1987), thymidylate
synthetase (ts) (Storm et al., 1984; Ayusawa et al., 1986)
and dihydrofolate reductase (dhfr) (Farnham and Schimke,
1985) undergo transient expression in the cell cycle. DNA
polymerase α is the principal enzyme that replicates chromo-
somal DNA. The steady-state level of polymerase α message
increases when cells are activated to proliferate and correlates
with the increase of enzymatic activity and de novo synthesis
of antigenic protein. The concerted increase of these three
parameters implies the regulation of the expression of this
key DNA replication enzyme is at the transcriptional level.
A gene that exhibits transient increase in expression during 
cell cycle or activation to proliferate could be a regulator 
of a restriction point, or the target of a cell-cycle or cell- 
proliferation-specific regulatory signal. Thus far we have 
only analyzed the proliferation-associated steady-state 
message of polymerase α. The results imply that expres-
sion of polymerase α may be restricted to cells entering the
cell cycle. The transient increase in polymerase α mRNA
following the stimulation of quiescent cells could be the result
of an activational event that renders the cells competent to
induce transcription of DNA polymerase α and subsequently
initiate DNA synthesis. To further understand the transcrip-
tional regulation of this gene, investigation of the nuclear
transcription and steady-state message of DNA polymerase
α during activation of quiescent cells to proliferate, as well
as within cell cycle, is necessary. The observation of signifi-
cant amplification of steady-state polymerase α mRNA, de
novo protein synthesis and enzymatic activity in transform-
ed cells as compared to non-transformed cells, poses the
question whether DNA polymerase α is a target for oncogene
activation.

Recent studies from various eukaryotic systems indicate 
that, in vitro, the catalytic polypeptide of DNA polymerase 
α is a polypeptide of 180 kd (Campbell, 1986; Wong et al.,
1986). The present amino acid sequence, deduced from the
cDNA sequence, demonstrates that the catalytic polypep-
tide of human DNA polymerase α has a minimum mol. wt of
165 kd. The discrepancy between this value and the 180 kd
polypeptide found in DNA polymerase α enzyme prepara-
tions suggests either there exists an additional 15 kd of
coding sequence further upstream from the putative transla-
tion start site, or that a post-translational modification reduces
its size. Compared to the yeast DNA polymerase I gene
(Johnson et al., 1985) which encodes a biologically func-
tional enzyme of 140 kd, a recombinant human polymerase
α of 165 kd may represent the full-length polypeptide.
Analysis of a recently isolated genomic clone containing the
polymerase α will define the
transcriptional and translational initiation sites as well as the
primary translation product.
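
To give a sense of scale for the 15 kd discrepancy discussed above, the following back-of-envelope sketch converts a protein mass into an approximate number of residues and coding nucleotides. The average residue mass of 110 Da is a common rule of thumb and is an assumption here, not a figure taken from the text; the function names are illustrative.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein.
# This value is an assumption for illustration, not taken from the paper.
AVERAGE_RESIDUE_DA = 110.0


def residues_for_mass(mass_kd):
    """Approximate number of residues in a polypeptide of the given mass (kd)."""
    return mass_kd * 1000.0 / AVERAGE_RESIDUE_DA


def coding_nucleotides(residues):
    """Number of coding nucleotides needed for that many residues (3 per codon)."""
    return 3.0 * residues


missing_kd = 180 - 165  # observed polypeptide minus deduced minimum mol. wt
missing_residues = residues_for_mass(missing_kd)

print(round(missing_residues))                      # about 136 residues
print(round(coding_nucleotides(missing_residues)))  # about 409 nucleotides
```

Under this assumption, the first explanation offered in the text would require roughly 0.4 kb of additional coding sequence upstream of the putative translation start site.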

Structure-function studies of DNA polymerase α are pre-
requisite to understanding the mechanisms by which it in-
teracts with other DNA replication proteins, dNTP and DNA
substrates. Analysis of the primary amino acid sequence
deduced from the cDNA has identified several structural
features which are similar to other nucleic-acid-interacting
proteins and DNA polymerases. The identification of cys/his-
rich motifs defines putative DNA binding domains. The
presence of highly conserved regions sharing similarity with
other DNA polymerases and having similar, although not
identical, spatial relationships suggests that these regions are
needed to maintain essential function. Secondary structure
predictions for each of the three highly conserved regions,
I, II and III, indicate that each composes a turn (cleft). Four
regions, II, VI, III and I, are localized within a contiguous
region that comprises only 9% of the total length of the poly-
peptide. This close proximity prompts us to speculate that
these domains may form a substrate binding site on the en-
zyme surface. Experimental evidence based on genetic
marker transfer and rescue data of herpes DNA polymer-
ase mutants (Knopf et al., 1981; Coen et al., 1983; Gibbs
et al., 1985; Quinn and McGeoch, 1985; Tsurumi et al.,
1987) implicates regions II, III and possibly region V in sub-
strate dNTP binding and PPi hydrolysis. Mutations clustered
in locations such as region II and III suggest a role for these
two sequences in substrate recognition or catalysis. Muta-
tion in region V, which is >100 amino acids apart from
regions II and III, results in similar drug resistance pheno-
types, suggesting polypeptide folding interactions that form
substrate binding sites. The identification of a region II-like
sequence in the α subunit of E.coli DNA polymerase III fur-
ther substantiates the functional importance of this region.
Thus far there are no regions I, IV and VI mutants isolated
from herpes virus. These regions may be the critical domains
required for interaction with other accessory proteins in DNA
replication or for substrate interaction. The biological func-
tion of each of these conserved domains should be definable
by site-specific mutagenesis and interchanging host and viral
polymerase structural determinants.

It is interesting to note that all DNA polymerases con-
taining these conserved regions identified in this study are
replicative enzymes. In addition, except for DNA polymer-
ase α and yeast polymerase I, all of the viral DNA poly-
merases and the two bacteriophage DNA polymerases have
two enzymological activities: a DNA polymerizing activity
and a 3'-5' proofreading exonuclease activity. Since error-
free DNA replication is an essential process for the survival
of biological organisms, one might expect that the proof-
reading function would be conserved in this key chromo-
somal replication enzyme from prokaryotes to eukaryotes.
This again raises the issue of the relationship between DNA
polymerases α and δ. Does mammalian polymerase α, like
Drosophila melanogaster polymerase α (Cotterill et al.,
1987), have an intrinsic but cryptic 3'-5' exonuclease ac-
tivity in the catalytic polypeptide, detectable only when
separated from other subunits? Enzymological characteriza-
tion of the functionally expressed polymerase α catalytic
polypeptide should provide an answer to this issue.

The lack of these six conserved sequences in E.coli poly-
merase III, the chromosomal replicative DNA polymerase,
suggests DNA polymerase α and E.coli polymerase III
evolved from different ancestral genes. These six similar
regions in replicative DNA polymerases are conserved to
phylogenetically distant species; however, they are notably
absent in several eukaryotic DNA polymerizing enzymes:
DNA polymerase β (Zmudzka et al., 1986; Matsukage et
al., 1987), terminal transferase (Peterson et al., 1985), retro-
viral reverse transcriptase (Kamer and Argos, 1984), or pro-
karyotic E.coli DNA polymerases I (Joyce et al., 1982) and
III (Tomasiewicz and McHenry, 1987). This suggests that
there is a class of DNA polymerases which are all replicative
DNA polymerases, containing these conserved regions and
sharing a primordial archetype.

Materials and methods 

Preparation of plasmid, restriction enzyme digestions and agarose gel electro-
phoresis were performed as described by Maniatis et al. (1982). DNA

were labeled with 32P by the methods of Feinberg and Vogelstein (1983),
synthetic oligonucleotide probes were labeled by polynucleotide kinase
as described by Maniatis et al. (1982). Anti-sense ribonucleotide probe was
prepared as described (Melton et al., 1984).

KB is a human epidermoid carcinoma cell line; IMR90 is a normal human
fetal lung fibroblast cell line; 293 is a human embryonic kidney cell line
transformed with sheared adenovirus DNA; MDA4 is a transformed human
cell line propagated from human mammary carcinoma. These cell lines
described above were from ATCC, Rockville, MD. GM1604 is a normal
human fetal lung fibroblast cell line and GMMQIA is a human cell line con-
taining 4X chromosomes; both these cell lines are from NIGMS Human
Cell Repository, Camden, NJ. TNHF is a normal human fibroblast primary
culture from neonatal foreskin developed by Dr Eric Stanbridge of University
of California, Irvine, CA. All rodent-human somatic hybrids used in gene
mapping were as previously described (Wang et al., 1985).

Isolation of human DNA polymerase α catalytic polypeptides
DNA polymerase α antigen polypeptides from six 18 l cultures of human
KB cells (3-5 x 10^ cells/ml) were purified with a monoclonal IgG
(SJK2R7)-Sepharose 4B column as described (Wong et al., 1986). The
polypeptides were suspended in 0.36 M Tris-HCl, pH 8.6, 3.3 mM EDTA,

and then reduced for 3 h at 37°C under N2 with 10 mM DTT.
The reduced polypeptides were alkylated with 22 mM iodoacetic acid at
4°C for 1 h and dialyzed in 50 mM NH4HCO3, 0.01% SDS. The dialyz-
ed reduced and alkylated DNA polymerase α protein was lyophilized,
^iS^ in IW^ NaPO.. pH 6.5, ^ ^''.fj^^^f^^
lyCtor 10 min. These polypeptides were then purified by HPLC through
two coupled gel permeation columns (TSK 3000, 7.5 x 300 mm) in
100 mM NaPO4, pH 6.5, and 0.1% SDS at a flow rate of 0.5 ml/min.
The absorbance of the eluate was monitored at 280 nm. Fractions contain-
ing the 180-140 kd DNA polymerase α catalytic polypeptides were dialyzed
in 50 mM NH4HCO3 containing 0.01% SDS and lyophilized.

Human DNA polymerase α catalytic polypeptides (500 pmol), isolated as
described above, were resuspended in H2O and ethanol-precipitated twice
to remove excess SDS from the samples. The polypeptides were then
resuspended in 0.1 M NH4HCO3, 10 mM CaCl2 and digested with 2
of TPCK treated trypsin at room temperature for 20 h. The trypsin digested
on an Aquapore RP300 (2.1 x 220 mm,
Brownlee Lab) HPLC column equilibrated in 0.1% trifluoroacetic acid. A
linear gradient from 0-60% acetonitrile was run over 45 min at 0.2 ml/min.
Absorbance at 220 nm was monitored by Spectroflow 755 Variable Wave-
length detector. Selected peptide peaks were further purified by an 8P3M
(1 x 100 mm) column equilibrated in 50 mM ammonium acetate, pH 6.5.
A linear gradient of 0-75% acetonitrile was run over aOtimi at 0.08 ml/min
and absorbance monitored at 215 nm. Each of the separated peptides was
subjected to automated Edman degradation performed on a ««»d 470A
gas phase sequencer with on-line PTH amino acid analysis (Model 120A)
(Hunkapiller et al., 1983).



Single long anti-sense oligonucleotide probes were designed according to
Lathe (1985) and were synthesized on an Applied Biosystems model 380A
oligonucleotide synthesizer.

Ninety µg of poly(A)+ mRNA from early mid-log human KB cells was
heated at 65°C for 1 min and loaded onto a 5.3 ml sucrose gradient of
5-25% containing 100 mM NaCl, 10 mM Tris-HCl, pH 7.4, 1 mM
EDTA and 0.1% SDS. Centrifugation was carried out at 5.1 300 ^ for 25 h
at 5°C and fractionated into 20 fractions. mRNA samples of each fraction
oligonucleotide probes. Hybridization conditions used were x SSPE, 0.1%
SDS, 100 µg/ml E.coli tRNA. Washing conditions were 2 x SSPE, 0.1%
SDS. Temperature of hybridization and washing depended on the individual
oligonucleotide probe used. Stringency of hybridization and washing of each
individual oligonucleotide probe was based on Tm (melting temperature) and
Td (washing temperature) values estimated at >85% probe-target
homology (Lathe, 1985).

Screening of 1 x 10^5 colonies of this size-selected library yielded a single
distinct positive clone designated as pcD-KBpolα, which hybridizes with
oligodeoxynucleotides TTM. 7765^X25 (TM. ^-^^^^^'^
of pcD-KBpolα (Figure 2) indicates that it contains a 289W-bp cDNA in-
sert with an open reading frame of 1865 bp terminated by a stop codon
and followed by a 102S-bp noncoding region. In this 1865-bp coding se-
quence there are four regions of deduced amino acid sequences that are
perfectly homologous to the previously determined amino acid sequences,
T764, T265, T2S and T9 (Table D). The 3'-non-translated region contains
several in-frame stop codons, and the consensus polyadenylation signal

the polyadenylation tail. This indicates that pcD-KBpolα contains the 3'-end
of the cDNA for human DNA polymerase α. To extend this cDNA
clone, the 5'-most restriction fragment of pcD-KBpolα was
used to screen 2 X I0» phage of a human pre-B cell cDNA library (El
library) constructed in λgt10 (Cleary et al., 1986). The very 5'-terminal
restriction fragments of the newly extended cDNA clones were used to further
screen the El library. The complete set of overlapping clones was sequenced
in both directions as described (Dale et al., 1985) and reassembled.

Genomic DNA and RNA blot hybridization

Genomic DNA hybridization. Five or 10 of human genomic DNA
were digested with EaM. DNA blot hybridization was carried out with
50 ng of 32P-labeled ?uUP:A 700-bp fragment of pcD-KBpolα (liT
c.p.m./µg). Hybridization was at 6 x SSC, 50 mM NaPO4, pH 7.0, 5
x Denhardt solution, 100 boiled and sonicated salmon sperm DNA,
50% formamide and 10% dextran sulfate at 42°C. The blot was washed
in 0.2 x SSC, 0.1% SDS at 55°C.

hybridization. Polyadenylated mRNA was analyzed on a 1% agarose
gel in formaldehyde. Northern blot hybridization was carried out with
50 ng of 32P-labeled restriction fragment of cDNA (lO' c.p.m./µg).
Hybridization and wash were as described above.

Amino acid sequence of other DNA polymerases

Nucleotide sequence from yeast DNA polymerase I was determined from
sequence analysis of a fi^W/flwdlU restriction fragment of yeast DNA
polymerase I gene (Johnson et al., 1985). Other viral DNA polymerase
sequences and T4 gene 46 sequence were derived either from published
E^Liooal ft«k of the National Biomedical Research Foun-
dation.